AITopics | reference measure

reference measure

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Non-convex entropic mean-field optimization via Best Response flow

Neural Information Processing Systems | Jun-13-2026, 18:13:20 GMT

We study the problem of minimizing non-convex functionals on the space of probability measures, regularized by the relative entropy (KL divergence) with respect to a fixed reference measure, as well as the corresponding problem of solving entropy-regularized non-convex-non-concave min-max problems. We utilize the Best Response flow (also known in the literature as the fictitious play flow) and study how its convergence is influenced by the relation between the degree of non-convexity of the functional under consideration, the regularization parameter and the tail behaviour of the reference measure. In particular, we demonstrate how to choose the regularizer, given the non-convex functional, so that the Best Response operator becomes a contraction with respect to the $L^1$-Wasserstein distance, which ensures the existence of its unique fixed point that is then shown to be the unique global minimizer for our optimization problem. This extends recent results where the Best Response flow was applied to solve convex optimization problems regularized by the relative entropy with respect to arbitrary reference measures, and with arbitrary values of the regularization parameter. Our results explain precisely how the assumption of convexity can be relaxed, at the expense of making a specific choice of the regularizer. Additionally, we demonstrate how these results can be applied in reinforcement learning in the context of policy optimization for Markov Decision Processes and Markov games with softmax parametrized policies in the mean-field regime.
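The contraction idea in the abstract above can be illustrated on a finite state space. This is a minimal sketch under stated assumptions, not the paper's construction: the functional is a hypothetical quadratic F(mu) = 0.5 * mu^T A mu with an indefinite symmetric matrix `A` (so F is non-convex on the simplex), its flat derivative is taken to be (A mu)(x), and the best response is assumed to have the Gibbs form Phi(mu)(x) proportional to ref(x) * exp(-(A mu)(x) / sigma). The flow d mu/dt = Phi(mu) - mu is discretized with an explicit Euler step. The names `best_response` and `flow`, the matrix, the reference measure and the step size are all illustrative choices.

```python
import math

def best_response(mu, A, ref, sigma):
    """Assumed Gibbs-type best response on a finite set:
    Phi(mu)(x) is proportional to ref(x) * exp(-(A mu)(x) / sigma)."""
    n = len(mu)
    grad = [sum(A[i][j] * mu[j] for j in range(n)) for i in range(n)]
    shift = min(grad)  # subtract the minimum for numerical stability
    weights = [ref[i] * math.exp(-(grad[i] - shift) / sigma) for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def flow(mu, A, ref, sigma, step=0.5, iters=200):
    """Explicit Euler discretization of d mu/dt = Phi(mu) - mu."""
    for _ in range(iters):
        br = best_response(mu, A, ref, sigma)
        mu = [(1.0 - step) * m + step * b for m, b in zip(mu, br)]
    return mu

# Hypothetical indefinite interaction matrix (F is non-convex on the simplex).
A = [[1.0, -0.5, 0.2],
     [-0.5, -1.0, 0.3],
     [0.2, 0.3, 0.5]]
ref = [0.5, 0.3, 0.2]   # reference measure (illustrative)
sigma = 5.0             # regularization chosen large relative to the entries of A

mu_a = flow([1 / 3, 1 / 3, 1 / 3], A, ref, sigma)
mu_b = flow([1.0, 0.0, 0.0], A, ref, sigma)

# Fixed-point residual ||mu - Phi(mu)||_1 and distance between the two runs.
residual = sum(abs(m - b) for m, b in zip(mu_a, best_response(mu_a, A, ref, sigma)))
gap = max(abs(x - y) for x, y in zip(mu_a, mu_b))
print(mu_a, residual, gap)
```

With a regularization parameter this large relative to `A`, both starting points should reach the same fixed point, which is the behaviour a contraction guarantees. With a much smaller `sigma` the iteration need not behave this way.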

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Decentralized Machine Learning with Centralized Performance Guarantees via Gibbs Algorithms

Bermudez, Yaiza, Perlaza, Samir, Esnaola, Iñaki

arXiv.org Machine Learning | Apr-23-2026

In this paper, it is shown, for the first time, that centralized performance is achievable in decentralized learning without sharing the local datasets. Specifically, when clients adopt an empirical risk minimization with relative-entropy regularization (ERM-RER) learning framework and a forward-backward communication between clients is established, it suffices to share the locally obtained Gibbs measures to achieve the same performance as that of a centralized ERM-RER with access to all the datasets. The core idea is that the Gibbs measure produced by client $k$ is used as the reference measure by client $k+1$. This effectively establishes a principled way to encode prior information through a reference measure. In particular, achieving centralized performance in the decentralized setting requires a specific scaling of the regularization factors with the local sample sizes. Overall, this result opens the door to novel decentralized learning paradigms that shift the collaboration strategy from sharing data to sharing the local inductive bias via the reference measures over the set of models.
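The chaining of Gibbs measures can be checked numerically on a finite model set. The sketch below is an illustration under assumptions, not the paper's algorithm: models are a hypothetical grid of scalars, the loss is squared error, only the forward pass is shown, and the scaling `lam_k = lam * n_total / n_k` is one choice under which the identity holds exactly here (the paper's precise scaling is not stated in the abstract). All names and data values are made up for the example.

```python
import math

def gibbs(ref, risks, lam):
    """Gibbs measure on a finite model set:
    P(theta) is proportional to ref(theta) * exp(-risk(theta) / lam)."""
    shift = min(risks)  # stabilizes the exponentials, cancels on normalization
    weights = [q * math.exp(-(r - shift) / lam) for q, r in zip(ref, risks)]
    z = sum(weights)
    return [w / z for w in weights]

def empirical_risk(theta, data):
    """Mean squared error of the constant predictor theta (illustrative loss)."""
    return sum((theta - x) ** 2 for x in data) / len(data)

models = [-1.0, 0.0, 0.5, 1.0, 2.0]          # hypothetical finite model set
prior = [0.2] * len(models)                  # initial reference measure
datasets = [[0.1, 0.4, 0.9], [1.2, 0.8], [0.3, 0.5, 0.7, 1.1]]  # one per client
lam = 0.5                                    # centralized regularization factor
n_total = sum(len(d) for d in datasets)

# Centralized ERM-RER: one Gibbs measure built from the pooled data.
pooled = [x for d in datasets for x in d]
central = gibbs(prior, [empirical_risk(t, pooled) for t in models], lam)

# Decentralized chain: client k+1 uses client k's Gibbs measure as its reference.
ref = prior
for d in datasets:
    lam_k = lam * n_total / len(d)           # assumed scaling with local sample size
    ref = gibbs(ref, [empirical_risk(t, d) for t in models], lam_k)
chained = ref

max_diff = max(abs(c - p) for c, p in zip(central, chained))
print(max_diff)
```

The two measures agree because the exponents add up: the sum over clients of n_k * L_k / (n_total * lam) equals the pooled risk divided by `lam`. No client ever sees another client's data, only the measure handed to it.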

artificial intelligence, machine learning, probability measure, (16 more...)

arXiv.org Machine Learning

2604.20492

Country:

Europe > Austria > Vienna (0.14)
Europe > France (0.05)
Oceania > French Polynesia (0.04)
(10 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On Certified Generalization in Structured Prediction

Neural Information Processing Systems | Feb-12-2026, 15:30:27 GMT

Structured prediction is the task of predicting an output which itself contains internal structure. As an example, consider the problem of image segmentation.

artificial intelligence, inductive learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Alpha Divergence Losses for Biometric Verification

Koutsianos, Dimitrios, Mosner, Ladislav, Panagakis, Yannis, Stafylakis, Themos

arXiv.org Artificial Intelligence | Nov-25-2025

Performance in face and speaker verification is largely driven by margin-based softmax losses such as CosFace and ArcFace. Recently introduced $α$-divergence loss functions offer a compelling alternative, particularly due to their ability to induce sparse solutions (when $α>1$). However, integrating an angular margin, which is crucial for verification tasks, is not straightforward. We find that this integration can be achieved in at least two distinct ways: via the reference measure (prior probabilities) or via the logits (unnormalized log-likelihoods). In this paper, we explore both pathways, deriving two novel margin-based $α$-divergence losses: Q-Margin (margin in the reference measure) and A3M (margin in the logits). We identify a training instability in A3M, caused by sparsity, and address it with a simple yet effective prototype re-initialization strategy. Our methods achieve significant performance gains on the challenging IJB-B and IJB-C face verification benchmarks. We demonstrate similarly strong performance in speaker verification on VoxCeleb. Crucially, our models significantly outperform strong baselines at low false acceptance rates (FAR). This capability is critical for practical high-security applications, such as banking authentication, where minimizing false authentications is paramount. Finally, the sparsity of $α$-divergence-based posteriors enables memory-efficient training, which is crucial for datasets with millions of identities.

artificial intelligence, machine learning, pattern recognition, (19 more...)

arXiv.org Artificial Intelligence

2511.13621

Country: Europe > Greece (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.55)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.55)
(2 more...)

Entropic optimal transport beyond product reference couplings: the Gaussian case on Euclidean space

Freulon, Paul, Georgakis, Nikitas, Panaretos, Victor

arXiv.org Machine Learning | Jul-3-2025

The optimal transport problem with squared Euclidean cost consists in finding a coupling between two input measures that maximizes correlation. Consequently, the optimal coupling is often singular with respect to Lebesgue measure. Regularizing the optimal transport problem with an entropy term yields an approximation called entropic optimal transport. Entropic penalties steer the induced coupling toward a reference measure with desired properties. For instance, when seeking a diffuse coupling, the most popular reference measures are the Lebesgue measure and the product of the two input measures. In this work, we study the case where the reference coupling is not necessarily assumed to be a product. We focus on the Gaussian case as a motivating paradigm, and provide a reduction of this more general optimal transport criterion to a matrix optimization problem. This reduction enables us to provide a complete description of the solution, both in terms of the primal variable and the dual variables. We argue that flexibility in terms of the reference measure can be important in statistical contexts, for instance when one has prior information, when there is uncertainty regarding the measures to be coupled, or to reduce bias when the entropic problem is used to estimate the un-regularized transport problem. In particular, we show in numerical examples that choosing a suitable reference plan makes it possible to reduce the bias caused by the entropic penalty.
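The role of the reference coupling can be seen in a small discrete example. The paper treats the Gaussian case on Euclidean space; the sketch below instead assumes finitely supported measures and solves min over couplings of <C, pi> + eps * KL(pi | R) with Sinkhorn scaling applied to the kernel K = R * exp(-C / eps), elementwise. The function name `sinkhorn_with_reference`, the support points, the weights and the non-product reference `diag_ref` are hypothetical choices for illustration.

```python
import math

def sinkhorn_with_reference(a, b, cost, ref, eps, iters=500):
    """Entropic OT with a general reference coupling on finite supports.
    The optimal plan has the form diag(u) K diag(v), K = ref * exp(-cost / eps)."""
    n, m = len(a), len(b)
    K = [[ref[i][j] * math.exp(-cost[i][j] / eps) for j in range(m)]
         for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

x = [0.0, 1.0, 2.0]
y = [0.5, 1.5, 2.5]
a = [0.2, 0.5, 0.3]
b = [0.4, 0.4, 0.2]
cost = [[(xi - yj) ** 2 for yj in y] for xi in x]   # squared Euclidean cost

# Product reference: the common default mentioned in the abstract.
product_ref = [[ai * bj for bj in b] for ai in a]

# Hypothetical non-product reference concentrating mass near the diagonal.
raw = [[math.exp(-abs(i - j)) for j in range(3)] for i in range(3)]
total = sum(sum(row) for row in raw)
diag_ref = [[r / total for r in row] for row in raw]

plan_prod = sinkhorn_with_reference(a, b, cost, product_ref, eps=1.0)
plan_diag = sinkhorn_with_reference(a, b, cost, diag_ref, eps=1.0)

rows = [sum(row) for row in plan_diag]
cols = [sum(plan_diag[i][j] for i in range(3)) for j in range(3)]
plan_gap = max(abs(plan_prod[i][j] - plan_diag[i][j])
               for i in range(3) for j in range(3))
print(rows, cols, plan_gap)
```

Both plans have the prescribed marginals, yet they differ, because `diag_ref` is not a rescaling of a product measure. Any reference of the form f(x) * g(y) would give the same plan as the product reference, since the factors are absorbed into `u` and `v`.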

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Machine Learning

2507.01709

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > Michigan (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Linearized Optimal Transport pyLOT Library: A Toolkit for Machine Learning on Point Clouds

Linwu, Jun, Khurana, Varun, Karris, Nicholas, Cloninger, Alexander

arXiv.org Machine Learning | Feb-5-2025

Instead, point clouds or continuous probability measures are the appropriate data structures. These data arise naturally in fields such as computer vision, image processing, shape analysis, and generative modeling, where representing complex objects as probability distributions provides a richer and more flexible framework for analysis. Real-world examples include text documents with bag-of-words models treating word counts as features, which forms a histogram for each document [35], imaging data where pixel intensity is interpreted as mass [26] and results in 2D discrete probability measures over the image grid, and gene expression data that is interpreted as a distribution across a gene network [8, 15]. Optimal transport (OT) theory [30] has recently emerged as a powerful tool to compare probability measures. Qualitatively, OT generates a distance metric between probability measures by minimizing the work needed to move one distribution into another over all transport plans. It has gained significant popularity for applications [4, 26, 27] involving point clouds and probability distributions. OT allows for the computation of distances between distributions by solving a minimization problem over transportation plans. Despite its theoretical elegance and its ability to capture geometric properties of distributions, using vanilla OT is computationally expensive and does not directly integrate into existing machine learning pipelines. For this reason, OT has been somewhat limited in practical applications, particularly in settings that demand scalable and efficient algorithms for tasks such as classification, dimension reduction, and generation.
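The description of OT as a minimization over transport plans can be made concrete in the simplest setting. The sketch below assumes two 1-D point clouds of equal size with uniform weights, where an optimal plan can be taken to be a permutation and, for squared cost, sorting both clouds gives the optimum. It does not use the pyLOT API; `wasserstein2_1d` and `wasserstein2_bruteforce` are hypothetical helper names, and the brute-force search is only feasible for tiny inputs.

```python
from itertools import permutations

def wasserstein2_1d(xs, ys):
    """W2 distance between two equal-size, uniformly weighted 1-D point clouds.
    In 1-D with squared cost, matching sorted points is optimal."""
    if len(xs) != len(ys):
        raise ValueError("point clouds must have the same number of points")
    pairs = zip(sorted(xs), sorted(ys))
    return (sum((x - y) ** 2 for x, y in pairs) / len(xs)) ** 0.5

def wasserstein2_bruteforce(xs, ys):
    """Same quantity by exhaustive search over all permutation plans."""
    best = min(sum((x - y) ** 2 for x, y in zip(xs, perm))
               for perm in permutations(ys))
    return (best / len(xs)) ** 0.5

xs = [0.3, 2.0, 1.1, -0.5]
ys = [1.0, 0.2, 2.5, 0.9]
fast = wasserstein2_1d(xs, ys)
slow = wasserstein2_bruteforce(xs, ys)

# Shifting a cloud by 1 moves every point by exactly 1, so the distance is 1.
shifted = wasserstein2_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
print(fast, slow, shifted)
```

The exhaustive search visits n! plans, which hints at why unstructured OT becomes expensive. The sorting shortcut is specific to one dimension and does not carry over to general point clouds.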

artificial intelligence, barycenter, machine learning, (19 more...)

arXiv.org Machine Learning

2502.03439

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Synthesis and Analysis of Data as Probability Measures with Entropy-Regularized Optimal Transport

Mallery, Brendan, Murphy, James M., Aeron, Shuchin

arXiv.org Machine Learning | Jan-14-2025

We consider synthesis and analysis of probability measures using the entropy-regularized Wasserstein-2 cost and its unbiased version, the Sinkhorn divergence. The synthesis problem consists of computing the barycenter, with respect to these costs, of $m$ reference measures given a set of coefficients belonging to the $m$-dimensional simplex. The analysis problem consists of finding the coefficients for the closest barycenter in the Wasserstein-2 distance to a given measure $\mu$. Under the weakest assumptions on the measures thus far in the literature, we compute the derivative of the entropy-regularized Wasserstein-2 cost. We leverage this to establish a characterization of regularized barycenters as solutions to a fixed-point equation for the average of the entropic maps from the barycenter to the reference measures. This characterization yields a finite-dimensional, convex, quadratic program for solving the analysis problem when $\mu$ is a barycenter. It is shown that these coordinates, as well as the value of the barycenter functional, can be estimated from samples with dimension-independent rates of convergence, a hallmark of entropy-regularized optimal transport, and we verify these rates experimentally. We also establish that barycentric coordinates are stable with respect to perturbations in the Wasserstein-2 metric, suggesting a robustness of these coefficients to corruptions. We employ the barycentric coefficients as features for classification of corrupted point cloud data, and show that compared to neural network baselines, our approach is more efficient in small training data regimes.

barycenter, inequality, ot 2, (15 more...)

arXiv.org Machine Learning

2501.07446

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Linearized Wasserstein Barycenters: Synthesis, Analysis, Representational Capacity, and Applications

Werenski, Matthew, Mallery, Brendan, Aeron, Shuchin, Murphy, James M.

arXiv.org Machine Learning | Oct-30-2024

We propose the linear barycentric coding model (LBCM) that utilizes the linear optimal transport (LOT) metric for analysis and synthesis of probability measures. We provide a closed-form solution to the variational problem characterizing the probability measures in the LBCM and establish equivalence of the LBCM to the set of Wasserstein-2 barycenters in the special case of compatible measures. Computational methods for synthesizing and analyzing measures in the LBCM are developed with finite sample guarantees. One of our main theoretical contributions is to identify an LBCM, expressed in terms of a simple family, which is sufficient to express all probability measures on the interval $[0,1]$. We show that a natural analogous construction of an LBCM in $\mathbb{R}^2$ fails, and we leave it as an open problem to identify the proper extension in more than one dimension. We conclude by demonstrating the utility of LBCM for covariance estimation and data imputation.

lbcm, optimal transport map, transport map, (13 more...)

arXiv.org Machine Learning

2410.23602

Country: Europe > Greece (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Learning signals defined on graphs with optimal transport and Gaussian process regression

Perez, Raphaël Carpintero, da Veiga, Sébastien, Garnier, Josselin, Staber, Brian

arXiv.org Machine Learning | Oct-21-2024

In computational physics, machine learning has now emerged as a powerful complementary tool to explore efficiently candidate designs in engineering studies. Outputs in such supervised problems are signals defined on meshes, and a natural question is the extension of general scalar output regression models to such complex outputs. Changes between input geometries in terms of both size and adjacency structure in particular make this transition non-trivial. In this work, we propose an innovative strategy for Gaussian ...

Due to the associated computational cost, machine learning (ML) is a natural candidate to accelerate such design exploration: starting from an initial database of FEM simulations, a supervised model is trained to predict the FEM outputs from its inputs and is ultimately used as a proxy to evaluate new geometries with a negligible cost. But in this context, the supervised learning task actually involves inputs given as meshes, which can be modeled as graphs with continuous node attributes, different number of nodes and edges. In addition, the outputs can be scalar values but also physical quantities of interest defined on each node of the input graph, which we refer to as signals defined on graphs or fields.

artificial intelligence, machine learning, reference measure, (17 more...)

arXiv.org Machine Learning

2410.15721

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
